Simulation 1

Data structure: \(O = (W, A, Y)\)

  • U - exogenous variables
  • W - baseline covariate that is a measure of body condition
  • A - treatment level based on W, continuous between 0 and 5
  • Y - outcome, indicator of an event

Underlying data generating process, \(P_{U,X}\)

  • Exogenous variables:
    • \(U_W \sim Normal(\mu=0, \sigma^2 = 1^2)\)
    • \(U_A \sim Normal(\mu=0, \sigma^2 = 2^2)\)
    • \(U_Y \sim Uniform(min = 0, max = 1)\)
  • Structural equations F and endogenous variables:
    • \(W = U_W\)
    • \(A = bound(2 - 0.5W + U_A, min=0, max=5)\)
    • \(Y = \mathbf{I}[U_Y < expit(-5 + W + 2.25A -0.5WA)]\)
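
The structural equations above can be sketched in Python as follows (the printed output in this report appears to come from R; Python is used here only for illustration). The function name `simulate_sim1` and the seed are hypothetical. The sketch assumes that the variance-1 normal is \(U_W\) and the variance-4 normal is \(U_A\), and that `bound` means clipping to [0, 5].

```python
import numpy as np

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def simulate_sim1(n, seed=0):
    """Draw n observations O = (W, A, Y) from the Simulation 1 equations."""
    rng = np.random.default_rng(seed)
    # Exogenous variables (assumed: sd 1 for U_W, sd 2 for U_A)
    u_w = rng.normal(0.0, 1.0, n)
    u_a = rng.normal(0.0, 2.0, n)
    u_y = rng.uniform(0.0, 1.0, n)
    # Structural equations
    w = u_w
    a = np.clip(2 - 0.5 * w + u_a, 0, 5)  # bound(..., min=0, max=5)
    y = (u_y < expit(-5 + w + 2.25 * a - 0.5 * w * a)).astype(int)
    return w, a, y

w, a, y = simulate_sim1(100_000)
print(w.mean(), a.mean(), y.mean())
```

With a large n, the three sample means should land close to the means in the summary table that follows, which gives a quick check that the equations were transcribed correctly.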

Outcome of interest: \(E_0[Y|a,W]\)

##        W                   A                Y         
##  Min.   :-3.920105   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:-0.672772   1st Qu.:0.5787   1st Qu.:0.0000  
##  Median : 0.000686   Median :2.0133   Median :0.0000  
##  Mean   : 0.003438   Mean   :2.1197   Mean   :0.4689  
##  3rd Qu.: 0.676795   3rd Qu.:3.4294   3rd Qu.:1.0000  
##  Max.   : 3.616531   Max.   :5.0000   Max.   :1.0000
## Summary of A given W < -1:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.365   2.753   2.685   4.083   5.000
## Summary of A given -1 < W <= 0:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.8798  2.2863  2.3160  3.6474  5.0000
## Summary of A given 0 < W <= 1:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.4394  1.7482  1.9473  3.1546  5.0000
## Summary of A given 1 < W:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   1.103   1.516   2.554   5.000
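
The stratified summaries above show the treatment distribution shifting downward as W increases while still spanning the full range [0, 5] in every stratum. A minimal Python sketch that produces comparable summaries under the stated structural equations is given below. Variable names and the seed are illustrative, and sd 1 for \(U_W\) and sd 2 for \(U_A\) are assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
w = rng.normal(0.0, 1.0, n)                               # W = U_W (assumed sd 1)
a = np.clip(2 - 0.5 * w + rng.normal(0.0, 2.0, n), 0, 5)  # A (assumed sd 2 for U_A)

# Strata of W matching the printed summaries
strata = {
    "W < -1": w < -1,
    "-1 < W <= 0": (w > -1) & (w <= 0),
    "0 < W <= 1": (w > 0) & (w <= 1),
    "1 < W": w > 1,
}

medians = {}
for label, mask in strata.items():
    q1, med, q3 = np.quantile(a[mask], [0.25, 0.5, 0.75])
    medians[label] = med
    print(f"{label:>12}: Q1={q1:.3f} median={med:.3f} "
          f"mean={a[mask].mean():.3f} Q3={q3:.3f}")
```

Because the mean of A before bounding is 2 - 0.5W, the median of A should fall from stratum to stratum as W rises, as it does in the printed output.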

n = 200

## [1] "The average fitting time for CV-HAL: 4.8179 seconds"
## [1] "The average fitting time for globally undersmoothed HAL: 7.3648 seconds"
## [1] "The average fitting time for locally undersmoothed HAL: 31.9061 seconds"

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.002  0.003 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.007  0.008  0.023  0.044  0.064  0.108  0.178  0.238  0.351  0.466  0.632
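
The column labels in the table above look like a fixed grid of 22 values: 1.2 and 1.1, followed by 20 values evenly spaced on the log scale from 1 down to 0.001. The report does not say what the grid indexes. A sequence of scaling factors applied to the HAL penalty is one plausible reading, but that is an assumption. The sketch below only reconstructs the printed labels; the name `grid` is hypothetical.

```python
import numpy as np

# Two values above 1, then 20 log-spaced values from 1 down to 0.001
grid = np.concatenate([[1.2, 1.1], np.geomspace(1.0, 0.001, num=20)])

# Round the way the labels are printed (up to 4 decimals)
labels = np.round(grid, 4)
print(labels)
```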

n = 500

## [1] "The average fitting time for CV-HAL: 5.3619 seconds"
## [1] "The average fitting time for globally undersmoothed HAL: 10.0429 seconds"
## [1] "The average fitting time for locally undersmoothed HAL: 54.3114 seconds"

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.000  0.003  0.005  0.016  0.023  0.038  0.059  0.103  0.153  0.221  0.268

n = 1000

## [1] "The average fitting time for CV-HAL: 5.8012 seconds"
## [1] "The average fitting time for globally undersmoothed HAL: 11.4634 seconds"
## [1] "The average fitting time for locally undersmoothed HAL: 62.6101 seconds"

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.000  0.001  0.001  0.007  0.010  0.016  0.027  0.047  0.073  0.104  0.135

Simulation 2

Data structure: \(O = (W, A, Y)\)

  • U - exogenous variables
  • W - baseline covariate that is a measure of body condition
  • A - treatment level based on W, continuous between 0 and 5
  • Y - outcome, indicator of an event

Underlying data generating process, \(P_{U,X}\)

  • Exogenous variables:
    • \(U_W \sim Normal(\mu=0, \sigma^2 = 1^2)\)
    • \(U_A \sim Normal(\mu=0, \sigma^2 = 2^2)\)
    • \(U_Y \sim Uniform(min = 0, max = 1)\)
  • Structural equations F and endogenous variables:
    • \(W = U_W\)
    • \(A = bound(2 - 0.5W + U_A, min=0, max=5)\)
    • \(Y = \mathbf{I}[U_Y < expit(-10 + 2W + 5\sin(A^{1.5}) + 2WA)]\)
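
Relative to Simulation 1, only the outcome equation changes. The Python sketch below encodes that equation and estimates the marginal event rate by simulation. The function name `true_mean_sim2`, the seed, and the sample size are illustrative, and sd 1 for \(U_W\) and sd 2 for \(U_A\) are assumed.

```python
import numpy as np

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def true_mean_sim2(a, w):
    """P(Y = 1 | A = a, W = w) implied by the Simulation 2 outcome equation."""
    return expit(-10 + 2 * w + 5 * np.sin(a ** 1.5) + 2 * w * a)

rng = np.random.default_rng(0)
n = 100_000
w = rng.normal(0.0, 1.0, n)                               # W = U_W
a = np.clip(2 - 0.5 * w + rng.normal(0.0, 2.0, n), 0, 5)  # bounded treatment
y = (rng.uniform(0.0, 1.0, n) < true_mean_sim2(a, w)).astype(int)

print("event rate:", y.mean())
```

The event is much rarer here than in Simulation 1, which is consistent with the small mean of Y in the summary table for this simulation.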

Outcome of interest: \(E_0[Y|a,W]\)

##        W                  A               Y         
##  Min.   :-3.61831   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:-0.68682   1st Qu.:0.612   1st Qu.:0.0000  
##  Median :-0.00980   Median :2.021   Median :0.0000  
##  Mean   :-0.00214   Mean   :2.124   Mean   :0.0661  
##  3rd Qu.: 0.67024   3rd Qu.:3.395   3rd Qu.:0.0000  
##  Max.   : 4.61816   Max.   :5.000   Max.   :1.0000
## Summary of A given W < -1:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.408   2.752   2.694   4.110   5.000
## Summary of A given -1 < W <= 0:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.8729  2.2537  2.3142  3.6316  5.0000
## Summary of A given 0 < W <= 1:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.4784  1.7923  1.9353  3.0472  5.0000
## Summary of A given 1 < W:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   1.147   1.531   2.612   5.000

n = 200

## [1] "The average fitting time for CV-HAL: 4.6812 seconds"
## [1] "The average fitting time for globally undersmoothed HAL: 6.3512 seconds"
## [1] "The average fitting time for locally undersmoothed HAL: 18.9695 seconds"

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.708  0.720  0.726  0.744  0.769  0.781  0.811  0.827  0.840  0.866  0.888 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.905  0.910  0.926  0.925  0.933  0.934  0.946  0.955  0.960  0.975  0.978

n = 500

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.946  0.947  0.948  0.944  0.944  0.956  0.969  0.984  0.988  0.990  0.989 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.995  0.995  0.997  1.000  0.999  1.000  1.000  1.000  1.000  1.000  1.000

n = 1000

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.969  0.968  0.971  0.976  0.978  0.989  0.993  0.998  0.999  1.000  1.000 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

Simulation 3

Data structure: \(O = (W, A, Y)\)

  • U - exogenous variables
  • W - baseline covariate that is a measure of body condition
  • A - treatment level based on W, continuous between 0 and 5
  • Y - outcome, indicator of an event

Underlying data generating process, \(P_{U,X}\)

  • Exogenous variables:
    • \(U_W \sim Normal(\mu=0, \sigma^2 = 1^2)\)
    • \(U_A \sim Normal(\mu=0, \sigma^2 = 2^2)\)
    • \(U_Y \sim Uniform(min = 0, max = 1)\)
  • Structural equations F and endogenous variables:
    • \(W = U_W\)
    • \(A = bound(2 - 0.5W + U_A, min=0, max=5)\)
    • \(Y = \mathbf{I}[U_Y < expit(-10 - 3W + 4A + \mathbf{I}(A>2) \cdot 5\sin((0.8A)^2 - 2.6))]\)

Outcome of interest: \(E_0[Y|a,W]\)
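
In a simulation the target \(E_0[Y|a,W]\) is known in closed form: because \(U_Y\) is uniform on (0, 1), it equals the expit term in the outcome equation. A Python sketch for Simulation 3, evaluated on a small grid, is given below. The function name `true_mean_sim3` and the grid values are illustrative choices, not part of the original analysis.

```python
import numpy as np

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def true_mean_sim3(a, w):
    """P(Y = 1 | A = a, W = w) implied by the Simulation 3 outcome equation."""
    a = np.asarray(a, dtype=float)
    # The sine term is only active for treatment levels above 2
    wiggle = np.where(a > 2, 5 * np.sin((0.8 * a) ** 2 - 2.6), 0.0)
    return expit(-10 - 3 * w + 4 * a + wiggle)

a_grid = np.linspace(0, 5, 11)
for w0 in (-1.0, 0.0, 1.0):
    print(f"W = {w0:+.0f}:", np.round(true_mean_sim3(a_grid, w0), 3))
```

At a = 2 the sine argument is (1.6)^2 - 2.6 = -0.04, so the extra term switches on at a value close to zero (about -0.2 on the logit scale) and then oscillates as a grows.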

##        W                  A               Y         
##  Min.   :-3.61831   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:-0.68682   1st Qu.:0.612   1st Qu.:0.0000  
##  Median :-0.00980   Median :2.021   Median :0.0000  
##  Mean   :-0.00214   Mean   :2.124   Mean   :0.4369  
##  3rd Qu.: 0.67024   3rd Qu.:3.395   3rd Qu.:1.0000  
##  Max.   : 4.61816   Max.   :5.000   Max.   :1.0000
## Summary of A given W < -1:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.408   2.752   2.694   4.110   5.000
## Summary of A given -1 < W <= 0:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.8729  2.2537  2.3142  3.6316  5.0000
## Summary of A given 0 < W <= 1:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.4784  1.7923  1.9353  3.0472  5.0000
## Summary of A given 1 < W:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   1.147   1.531   2.612   5.000

n = 200

## [1] "The average fitting time for CV-HAL: 3.7617 seconds"
## [1] "The average fitting time for globally undersmoothed HAL: 6.0892 seconds"
## [1] "The average fitting time for locally undersmoothed HAL: 24.4744 seconds"

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.002  0.003  0.004  0.004  0.008  0.020  0.027  0.049  0.080  0.098  0.132 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.176  0.250  0.300  0.384  0.453  0.549  0.621  0.697  0.762  0.812  0.849

n = 500

## [1] "The average fitting time for CV-HAL: 5.9994 seconds"
## [1] "The average fitting time for globally undersmoothed HAL: 14.4206 seconds"
## [1] "The average fitting time for locally undersmoothed HAL: 84.0663 seconds"

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.005  0.004  0.004  0.004  0.012  0.015  0.022  0.041  0.057  0.074  0.118 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.174  0.243  0.321  0.408  0.510  0.593  0.701  0.794  0.834  0.890  0.916

n = 1000

## [1] "The average fitting time for CV-HAL: 10.6242 seconds"
## [1] "The average fitting time for globally undersmoothed HAL: 22.6256 seconds"
## [1] "The average fitting time for locally undersmoothed HAL: 113.1588 seconds"

## [1] "proportion of simulations that cannot compute empirical SD:"
##    1.2    1.1      1 0.6952 0.4833  0.336 0.2336 0.1624 0.1129 0.0785 0.0546 
##  0.000  0.001  0.002  0.005  0.004  0.010  0.013  0.017  0.023  0.044  0.071 
## 0.0379 0.0264 0.0183 0.0127 0.0089 0.0062 0.0043  0.003 0.0021 0.0014  0.001 
##  0.099  0.149  0.228  0.307  0.396  0.504  0.564  0.640  0.690  0.765  0.787